Skip to content

Preserve model backend suffix in generated tasks#120

Open
njzjz wants to merge 5 commits into
develfrom
preserve-model-backend-suffix
Open

Preserve model backend suffix in generated tasks#120
njzjz wants to merge 5 commits into
develfrom
preserve-model-backend-suffix

Conversation

@njzjz

@njzjz njzjz commented Jun 19, 2026

Copy link
Copy Markdown
Member

Summary

  • preserve the input model suffix when copying or linking DeepMD models into generated tasks, e.g. model.pth -> graph.pth
  • update HTI/GDI/TI/equi/MTI/water workflow submission paths to forward/link the actual generated model filename
  • add tests for suffix preservation and model filename lookup

Tests

  • PYTHONPATH=.. pytest -q test_lib_utils.py test_hti_make_task.py test_gdi_make_task.py test_equi_make_task.py test_ti_make_task.py test_hti_liq_make_task.py test_hti_liq_gen_lammps_input.py test_hti_water_gen_lammps_input.py test_ti_water_gen_lammps_input.py test_hti_gen_lammps_input.py
  • python -m ruff check --select I,F --ignore F841 dpti/lib/utils.py dpti/equi.py dpti/gdi.py dpti/hti.py dpti/hti_water.py dpti/hti_liq.py dpti/hti_ice.py dpti/ti.py dpti/mti.py dpti/relax.py dpti/old_equi.py tests/test_lib_utils.py tests/test_hti_make_task.py tests/test_gdi_make_task.py tests/test_ti_make_task.py tests/test_equi_make_task.py workflow/DpFreeEnergy.py workflow/DpFreeEnergyWater.py
  • python -m compileall -q dpti workflow tests/test_lib_utils.py tests/test_hti_make_task.py tests/test_gdi_make_task.py tests/test_ti_make_task.py tests/test_equi_make_task.py

Full pytest -q currently reports 116 passed, 1 error; the remaining error is an existing pytest collection issue in tests/test_hti_ff_spring.py:54, where a module-level test function has a self parameter.

Summary by CodeRabbit

Release Notes

  • New Features

    • Model/graph artifact filenames are now resolved dynamically (including support for non-graph.pb extensions) and are correctly propagated through job/task setup, linking, and submission forwarding.
    • Phase-specific graph/model naming is supported for one-phase and multi-phase workflows.
  • Bug Fixes

    • Improved correctness of which model artifact is staged/forwarded into downstream tasks (no longer always graph.pb).
  • Tests

    • Added/updated unit tests to cover dynamic model filename resolution and phase-specific naming behavior.

@njzjz njzjz requested a review from Yi-FanLi June 19, 2026 16:53
@njzjz njzjz linked an issue Jun 19, 2026 that may be closed by this pull request
@codecov

codecov Bot commented Jun 19, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 0% with 104 lines in your changes missing coverage. Please review.
✅ Project coverage is 0.00%. Comparing base (4a6fed5) to head (103e64f).
⚠️ Report is 1 commits behind head on devel.

Files with missing lines Patch % Lines
dpti/hti.py 0.00% 25 Missing ⚠️
dpti/equi.py 0.00% 12 Missing ⚠️
dpti/mti.py 0.00% 11 Missing ⚠️
dpti/ti.py 0.00% 11 Missing ⚠️
dpti/gdi.py 0.00% 10 Missing ⚠️
dpti/hti_liq.py 0.00% 10 Missing ⚠️
dpti/hti_water.py 0.00% 9 Missing ⚠️
dpti/lib/utils.py 0.00% 5 Missing ⚠️
dpti/hti_ice.py 0.00% 4 Missing ⚠️
dpti/relax.py 0.00% 4 Missing ⚠️
... and 1 more
Additional details and impacted files
@@          Coverage Diff          @@
##           devel    #120   +/-   ##
=====================================
  Coverage   0.00%   0.00%           
=====================================
  Files         25      25           
  Lines       6325    6390   +65     
=====================================
- Misses      6325    6390   +65     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@coderabbitai

coderabbitai Bot commented Jun 19, 2026

Copy link
Copy Markdown

Review Change Stack

Warning

Review limit reached

@njzjz, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 14 minutes and 34 seconds. Learn how PR review limits work.

Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file).

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits.

🚦 How do rate limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan refill rate.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, the refill rate gradually slows as usage increases. The highest same-day bursts are limited more strictly.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b156cd15-e993-4587-95cf-0d926ea2c8ff

📥 Commits

Reviewing files that changed from the base of the PR and between 25e9b37 and 103e64f.

📒 Files selected for processing (2)
  • dpti/mti.py
  • tests/test_hti_make_task.py
📝 Walkthrough

Walkthrough

Introduces a get_model_filename(model, prefix) utility that derives a model artifact filename from the input model path extension instead of defaulting to graph.pb. This resolved filename is propagated through task generation (make_tasks, _make_tasks, refine_tasks), task execution (run_task, _get_task_model_file helpers), and workflow dispatch (DpFreeEnergy.py, DpFreeEnergyWater.py) across all simulation modules.

Changes

Dynamic model filename resolution

Layer / File(s) Summary
get_model_filename utility and tests
dpti/lib/utils.py, tests/test_lib_utils.py
Adds get_model_filename(model, prefix="graph") returning {prefix}.pb when model is None or {prefix}{suffix} derived from the model path extension. Tests cover suffix preservation, prefix argument, and None default.
Task generation in hti module
dpti/hti.py, tests/test_hti_make_task.py
make_tasks and _make_tasks compute the model filename via get_model_filename, symlink/copy the model into job directories and per-task directories using the derived filename, update jdata["model"] to the derived name, and pass the computed filename to _gen_lammps_input. Tests verify suffix preservation in generated LAMMPS inputs.
Task generation in hti_liq, hti_water, hti_ice, gdi, old_equi, relax
dpti/hti_liq.py, dpti/hti_water.py, dpti/hti_ice.py, dpti/gdi.py, dpti/old_equi.py, dpti/relax.py, tests/test_gdi_make_task.py
Each module calls get_model_filename(model) in make_tasks and refine_tasks to compute the local artifact name, symlinks/copies the model under that name into job and per-task directories, and updates jdata["model"] and LAMMPS input generation calls. GDI adds per-phase naming via _get_phase_graph_file using a graph.{phase_idx} prefix and updates make_dpdt per-phase forward_files. Tests assert phase-specific graph names and suffix preservation.
Task execution: per-task model file helpers and run_task
dpti/equi.py, dpti/hti.py, dpti/mti.py, dpti/ti.py, tests/test_equi_make_task.py, tests/test_ti_make_task.py
Adds _get_task_model_file() helpers in equi, hti, mti, and ti that read per-task settings JSON files to recover the model basename at runtime. run_task in each module uses the resolved filename to build symlink commands (with safe quoting via shlex) and populate forward_files/forward_common_files instead of unconditional graph.pb. Tests verify settings-file-based filename extraction.
Workflow forward_files wiring
workflow/DpFreeEnergy.py, workflow/DpFreeEnergyWater.py
Both workflow files add _model_file_from_json and _forward_files_with_model helpers. All Task constructions for NPT_sim, NVT_sim, HTI_sim, and TI_sim replace hard-coded ["graph.pb"] forward lists with calls to these helpers that derive the model filename from the relevant settings JSON.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • deepmodeling/dpti#110: Adds _graph_link_command in dpti/hti.py for the one-step graph.pb symlink path, which this PR generalizes to accept a dynamically resolved model_file argument.
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Preserve model backend suffix in generated tasks' accurately and concisely summarizes the core objective of the changeset.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch preserve-model-backend-suffix

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@dpti/hti.py`:
- Around line 1510-1515: The _graph_link_command function constructs a shell
command by directly interpolating graph_relpath and model_file into an ln -s
command without proper quoting, which creates a security and stability issue if
these paths contain spaces or shell metacharacters. Fix this by using shell
quoting (such as shlex.quote) to properly escape both the graph_relpath source
argument and the model_file destination argument in the return statement, and
add the -f flag to the ln -s command to make it idempotent. Apply the same fix
to the similar code mentioned in the comment at lines 1547-1550.

In `@dpti/mti.py`:
- Around line 339-343: The model_file variable is being interpolated directly
into shell ln -s commands without proper quoting, creating a security and
robustness issue. In the shell command fragments where link_model is assigned
(around lines 339 and 341, and also in the similar block at lines 374-380), wrap
both instances of the model_file variable with double quotes in the ln -s
commands to properly escape shell-sensitive characters. Additionally, add the -f
flag to the ln -s command to make the linking idempotent, so it will overwrite
existing links without error. Apply these same quoting and -f flag changes to
all occurrences of link_model assignments that use model_file.

In `@dpti/ti.py`:
- Around line 1081-1085: The model_file variable is directly interpolated into
the shell command string without proper quoting, creating a vulnerability to
command injection or path parsing issues if the filename contains special
characters or spaces. In the command string that includes the ln -s symlink
creation, wrap both occurrences of model_file with appropriate shell quotes
(single or double quotes) to properly escape the path. This applies to the
conditional command assignment where model_file is used twice in the symlink
command.
🪄 Autofix (Beta)

✅ Autofix completed


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6ad7ce01-416a-4024-9b69-d9f051ce24f1

📥 Commits

Reviewing files that changed from the base of the PR and between 4a6fed5 and 94eaf11.

📒 Files selected for processing (18)
  • dpti/equi.py
  • dpti/gdi.py
  • dpti/hti.py
  • dpti/hti_ice.py
  • dpti/hti_liq.py
  • dpti/hti_water.py
  • dpti/lib/utils.py
  • dpti/mti.py
  • dpti/old_equi.py
  • dpti/relax.py
  • dpti/ti.py
  • tests/test_equi_make_task.py
  • tests/test_gdi_make_task.py
  • tests/test_hti_make_task.py
  • tests/test_lib_utils.py
  • tests/test_ti_make_task.py
  • workflow/DpFreeEnergy.py
  • workflow/DpFreeEnergyWater.py

Comment thread dpti/hti.py Outdated
Comment thread dpti/mti.py Outdated
Comment thread dpti/ti.py
@coderabbitai

coderabbitai Bot commented Jun 19, 2026

Copy link
Copy Markdown

Note

Autofix is a beta feature. Expect some limitations and changes as we gather feedback and continue to improve it.

Fixes Applied Successfully

Fixed 3 file(s) based on 3 unresolved review comments.

Files modified:

  • dpti/hti.py
  • dpti/mti.py
  • dpti/ti.py

Commit: 25e9b37db69ee8fa0d99f224f2d5cf4c97b0cd71

The changes have been pushed to the preserve-model-backend-suffix branch.

Time taken: 5m 34s

Fixed 3 file(s) based on 3 unresolved review comments.

Co-authored-by: CodeRabbit <noreply@coderabbit.ai>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
dpti/ti.py (1)

1081-1084: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Shell quoting is still bypassable for crafted filenames.

At Line 1082, wrapping model_file in double quotes is not enough if the basename contains quotes; this can still break command parsing/injection safety. Please shell-escape both path operands before interpolation.

Suggested patch
 def run_task(task_name, machine_file):
+    import shlex
@@
-            command=(
-                f'ln -sf "../{model_file}" "{model_file}"; {mdata["command"]} -in in.lammps'
-                if model_file
-                else f"{mdata['command']} -in in.lammps"
-            ),
+            command=(
+                f"ln -sf {shlex.quote(f'../{model_file}')} {shlex.quote(model_file)}; {mdata['command']} -in in.lammps"
+                if model_file
+                else f"{mdata['command']} -in in.lammps"
+            ),
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dpti/ti.py` around lines 1081 - 1084, The command construction at the `ln
-sf` statement does not properly shell-escape the model_file variable before
interpolation, which allows shell injection if the filename contains special
characters like quotes. Use a shell escaping function (such as shlex.quote() in
Python) to properly escape both the symlink source path (the quoted
"../{model_file}" part) and the symlink destination path (the quoted
"{model_file}" part) before interpolating them into the f-string command. This
ensures that any special characters in the filename cannot break out of the
quoted context or inject arbitrary commands.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@dpti/ti.py`:
- Around line 1081-1084: The command construction at the `ln -sf` statement does
not properly shell-escape the model_file variable before interpolation, which
allows shell injection if the filename contains special characters like quotes.
Use a shell escaping function (such as shlex.quote() in Python) to properly
escape both the symlink source path (the quoted "../{model_file}" part) and the
symlink destination path (the quoted "{model_file}" part) before interpolating
them into the f-string command. This ensures that any special characters in the
filename cannot break out of the quoted context or inject arbitrary commands.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 305627ba-f5f6-43e1-820f-717d3362e822

📥 Commits

Reviewing files that changed from the base of the PR and between 94eaf11 and 25e9b37.

📒 Files selected for processing (3)
  • dpti/hti.py
  • dpti/mti.py
  • dpti/ti.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • dpti/mti.py
  • dpti/hti.py

@njzjz

njzjz commented Jun 19, 2026

Copy link
Copy Markdown
Member Author

pre-commit.ci autofix

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature Request] Support different backends in downstream workflows Support different backends for DeePMD-kit

1 participant